126 PART 3 Getting Down and Dirty with Data
If you can’t find any transformation that makes your data look even
approximately normal, then you have to analyze your data using nonparametric
methods, which don’t assume that your data are normally distributed.
Summarizing grouped data with bars, boxes, and whiskers
Sometimes you want to show how a numerical variable differs from one group of
participants to another. For example, blood levels of a certain cardiovascular
enzyme vary among the cardiology patients at four different clinics: Clinic A, B, C,
and D. Two types of graphs are commonly used for this purpose: bar charts and
box-and-whiskers plots.
Bar charts
One simple way to display and compare the means of several groups of data is
with a bar chart, like the one shown in Figure 9-7a. Here, the height of each
bar equals the mean (or median, or geometric mean) enzyme level for patients at
the clinic that the bar represents. The bar chart becomes even more informative
if you indicate the spread of values within each clinic’s sample by placing
lines representing one SD above and below the tops of the bars, as shown in
Figure 9-7b. These lines are commonly called error bars, an unfortunate choice
of words that can cause confusion, because they don’t mean that anyone made a
mistake. Here, error refers to statistical error (described in Chapter 6).
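To make the numbers behind such a chart concrete, here is a minimal sketch in Python that computes each clinic’s mean and SD and the two ends of its error bar (mean minus one SD, mean plus one SD). The enzyme values, the `summarize` and `text_bar_chart` function names, and the text rendering are all assumptions invented for illustration; they are not the data behind Figure 9-7.

```python
from statistics import mean, stdev

# Hypothetical enzyme levels (arbitrary units) for four clinics.
# These numbers are made up for illustration only.
enzyme_levels = {
    "Clinic A": [52, 61, 58, 49, 66, 55],
    "Clinic B": [71, 64, 80, 75, 69, 77],
    "Clinic C": [45, 39, 50, 42, 47, 41],
    "Clinic D": [60, 88, 57, 63, 59, 62],
}


def summarize(groups):
    """Return {group: (mean, sd, lower, upper)}, where lower and upper
    are the ends of an error bar spanning one SD either side of the mean."""
    summary = {}
    for name, values in groups.items():
        m = mean(values)
        sd = stdev(values)  # sample SD (n - 1 in the denominator)
        summary[name] = (m, sd, m - sd, m + sd)
    return summary


def text_bar_chart(summary, scale=1.0):
    """Render one text line per group: a bar of '#' characters whose
    length is proportional to the mean, followed by the error-bar range."""
    lines = []
    for name, (m, sd, lower, upper) in summary.items():
        bar = "#" * round(m * scale)
        lines.append(
            f"{name}: {bar} mean={m:.1f}, error bar {lower:.1f} to {upper:.1f}"
        )
    return lines


summary = summarize(enzyme_levels)
for line in text_bar_chart(summary, scale=0.5):
    print(line)
```

A plotting library would draw the same thing graphically; the point here is only that each bar needs two numbers per group, a center and a spread.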
But even with error bars, a bar chart still doesn’t provide a picture of the
distribution of enzyme levels within each group. Are the values skewed? Are
there outliers? You could make a histogram for each subgroup of patients
(Clinic A, Clinic B, Clinic C, and Clinic D), but four histograms would take up
a lot of space. There is a more compact solution: the box-and-whiskers plot,
the second type of graph mentioned earlier.
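As a rough illustration of what a bar of means can hide, the following Python sketch compares two small samples that share the same mean but have very different shapes. Both samples, the `text_histogram` helper, and the bin width of 10 are assumptions made up for this example.

```python
from statistics import mean, median, stdev

# Hypothetical enzyme levels, invented for illustration.
# Both samples have the same mean, so their bars would be the same height.
symmetric = [40, 45, 50, 50, 55, 60]
with_outlier = [44, 45, 46, 46, 47, 72]


def text_histogram(values, bin_width=10):
    """Count values into equal-width bins and return one text row per bin,
    including empty bins between the lowest and highest occupied bin."""
    counts = {}
    for v in values:
        start = (v // bin_width) * bin_width
        counts[start] = counts.get(start, 0) + 1
    rows = []
    for start in range(min(counts), max(counts) + bin_width, bin_width):
        stars = "*" * counts.get(start, 0)
        rows.append(f"{start:>3}-{start + bin_width - 1:<3} {stars}")
    return rows


for label, values in [("symmetric", symmetric), ("with outlier", with_outlier)]:
    print(
        f"{label}: mean={mean(values):.1f}, "
        f"SD={stdev(values):.1f}, median={median(values):.1f}"
    )
    for row in text_histogram(values):
        print("  " + row)
```

The histograms reveal the lone high value that the mean conceals, but notice that even these two tiny text histograms take several lines each. With four clinics, that adds up quickly.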
FIGURE 9-7: Bar charts showing mean values (a) and standard deviations (b).
© John Wiley & Sons, Inc.